Skip to content

fix(observer): ocr_enabled flag — bypass screencapture popup storm#10

Merged
AVADSA25 merged 1 commit into
mainfrom
hotfix/observer-ocr-disable
May 2, 2026
Merged

fix(observer): ocr_enabled flag — bypass screencapture popup storm#10
AVADSA25 merged 1 commit into
mainfrom
hotfix/observer-ocr-disable

Conversation

@AVADSA25

@AVADSA25 AVADSA25 commented May 2, 2026

Copy link
Copy Markdown
Owner

Summary

User report (2026-05-02 morning): after merging Step 5 and starting codec-observer, the macOS Screen Recording permission popup fired "over a hundred times" — the user clicked Allow but it kept coming back.

Root cause

Two-part bug in _get_screenshot_ocr:

  1. macOS Allow is one-shot. Persistent grant requires the System Settings toggle, not the popup's Allow button. Clicking Allow on the popup only authorizes that single call.

  2. ThreadPoolExecutor.__exit__ blocks on shutdown. When the popup appears, screencapture's subprocess hangs waiting for the user. My code:

    with ThreadPoolExecutor(max_workers=1) as ex:
        future = ex.submit(_screencapture_and_ocr_blocking)
        return future.result(timeout=timeout_s)

    __exit__ calls shutdown(wait=True) — blocks until the thread finishes. The thread can't finish until the popup is dismissed. The "100ms timeout" actually waited ~5s (subprocess.run's own 2s+3s chain). And the Q5.1 retry triggered a SECOND popup. Per poll: ~2 popups + 5s blocking.

Fix

Add ocr_enabled config flag (default True — preserves Step 5 behavior on permission-granted installs). When false, poll() skips _get_screenshot_ocr entirely. Buffer still captures active_window + clipboard + recent_files; only OCR is bypassed.

// ~/.codec/config.json
"observer": {
    "ocr_enabled": false   // disable per-machine until permissions granted
}

Test plan

  • 3 new tests added in tests/test_observer.py:
    • test_ocr_enabled_false_skips_screenshot_call — sentinel-based, raises if _get_screenshot_ocr is called despite the flag
    • test_ocr_enabled_default_is_true — preserves existing behavior
    • test_ocr_enabled_true_calls_screenshot — sanity check for the default path
  • pytest tests/test_observer.py33/33 passing in 0.21s. Zero side effects (Apple Reminders=0, /tmp/codec_*.txt=0).

Deployment

This PR ships the code change. The user's runtime config has ALREADY been edited to ocr_enabled: false (atomic write, backup at ~/.codec/config.json.bak-1777736987). After merge:

pm2 restart codec-observer

Observer resumes polling without OCR. No popups possible.

Permanent fix (out of scope for this hotfix)

Two follow-ups for a future PR:

  1. Permissions docs in AGENTS.md: how to grant Screen Recording + Accessibility to node AND python3.13 via System Settings. Persistent grant survives restarts.
  2. Subprocess-leak fix: replace ThreadPoolExecutor + future.result(timeout=...) with subprocess.Popen + explicit proc.kill() on timeout. The current shutdown-wait pattern is the underlying root cause; the ocr_enabled flag is just the immediate off-switch.

🤖 Generated with Claude Code

User report (2026-05-02): after merging Step 5 and starting codec-observer,
the screencapture-based OCR fired the macOS Screen Recording permission
popup repeatedly — "over a hundred times" per user. Each click of "Allow"
on the popup grants only one-time access; the persistent grant is the
toggle in System Settings → Privacy & Security → Screen & System Audio
Recording.

Compounding bug in codec_observer._get_screenshot_ocr:

  with ThreadPoolExecutor(max_workers=1) as ex:
      future = ex.submit(_screencapture_and_ocr_blocking)
      return future.result(timeout=timeout_s)

The `with` exit calls executor.shutdown(wait=True) which BLOCKS until
the thread finishes. When the popup appears, screencapture's subprocess
hangs waiting for the user, and the thread can't finish. Result: my
"100ms timeout" actually waited up to 5 seconds (full subprocess.run
timeout chain on screencapture+osascript), and BOTH the first attempt
AND the Q5.1 retry trigger their own popup. Per poll: ~2 popups +
5s blocking until user dismisses.

Hotfix: add `ocr_enabled` config flag (default True — preserves Step 5
design behavior on permission-granted installs) that, when set false,
skips _get_screenshot_ocr entirely. Buffer still captures active_window
+ clipboard + recent_files; only OCR is bypassed. No screencapture call,
no popup risk.

  ~/.codec/config.json:
    "observer": {
      "ocr_enabled": false   ← user can flip to disable OCR per-machine
    }

3 new tests in tests/test_observer.py:
  - test_ocr_enabled_false_skips_screenshot_call (sentinel-based: raises
    if _get_screenshot_ocr is called when flag is false)
  - test_ocr_enabled_default_is_true (preserves existing behavior)
  - test_ocr_enabled_true_calls_screenshot (sanity check)

Result: 33/33 tests passing in 0.21s. Zero side effects.

Permanent fix for OCR users: grant Screen Recording AND Accessibility to
both `node` (PM2 parent) and `python3.13` (codec-observer process) in
System Settings → Privacy & Security. Then the toggle stays on across
restarts.

Deeper subprocess-leak issue (ThreadPoolExecutor.with-exit blocks on
shutdown wait=True) is NOT fixed in this commit — would require restructuring
the OCR call to use Popen with explicit kill on timeout, or
ThreadPoolExecutor.shutdown(wait=False, cancel_futures=True) (Python 3.9+).
Tracked for follow-up. The ocr_enabled flag is the safe immediate fix.
@AVADSA25 AVADSA25 merged commit 26e6add into main May 2, 2026
1 check passed
AVADSA25 pushed a commit that referenced this pull request May 2, 2026
…(Steps 5/6/7)

Phase 2 (Observer + Triggers + Shift Report) merged and production-stable:
- Step 5 (PR #9 824a52f) + hotfix PR #10 (26e6add) — `observer.ocr_enabled` flag
- Step 6 (PR #11 2d2ff3f) — Trigger System (matcher + cooldown + consent)
- Step 7 (PR #12 0e40687) — end-of-day shift report

Net: +91 passing tests (823/20/73), 0 new failures, 0 new skips.
Live audit proof captured: shift_report_started+_completed paired emits at
2026-05-02T18:49:40Z with shared cid=5f188e5485e5.
AVADSA25 pushed a commit that referenced this pull request May 29, 2026
#10)

- tests/test_a12_invariant.py: fails if a new inline `.post(...chat/completions
  ...)` appears outside codec_llm. Allowlists the documented vision sites
  (dashboard, watcher, both bridges, screenshot_text) + codec_core's generated
  session-script string. Runs in CI via the full pytest suite (no separate
  workflow step needed). Includes a synthetic-violation test proving the guard
  is not a no-op.
- tests/test_oauth_provider.py: regression tests for the refresh-token
  scope-escalation defense (requested scopes must be a subset of the original
  grant) — previously untested.

permission_gate mutation cases are already covered by test_agent_runner.py
(8 tests incl. D-5 traversal/symlink); not duplicated. The flaky time.sleep ->
threading.Event refactor is deferred — no specific flaky test identified, and
rewriting passing tests without evidence risks regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AVADSA25 added a commit that referenced this pull request May 29, 2026
…al + highs) (#154)

* fix(concurrency): bound wall-clock time with run_with_timeout (C4)

The idiom `with ThreadPoolExecutor() as ex: fut.result(timeout=T)` defeats
its own timeout: the context-manager __exit__ calls shutdown(wait=True),
which blocks until the runaway task finishes. A 50ms timeout on a 5s task
took ~5s, so codec_mcp tool dispatch and codec_observer OCR could hang on a
slow skill or a stuck screencapture popup.

Add codec_concurrency.run_with_timeout (daemon thread + join(timeout), no
shutdown(wait=True)): surfaces the timeout promptly, abandons the daemon
worker (never blocks shutdown), re-raises fn exceptions, and raises
TimeoutError (== concurrent.futures.TimeoutError on py3.11+). Migrate both
call sites; mirrors the proven codec_hooks._run_hook_with_timeout pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(state): serialize SQLite + JSON read-modify-writes (C5/H14/M6)

CodecMemory shared one sqlite3 connection across threads
(check_same_thread=False) with no app-level lock — concurrent save/search on
the same connection can corrupt cursor state or even segfault (the new test
reproduced a hard crash before the fix). Add a threading.RLock and serialize
every runtime connection use (save, search, search_recent, get_sessions,
cleanup, rebuild_fts, close). RLock so get_context->search and
cleanup->close don't self-deadlock; WAL + busy_timeout stay the
cross-process layer.

Route the unguarded ~/.codec JSON read-modify-writes through the
cross-process file_lock:
- grant_permission (routes/agents.py): new codec_agent_plan.grants_lock
  (mirrors _status_lock) held across load_grants -> modify -> save_grants ->
  set_grants_hash, so concurrent /grant calls can't clobber grants.json.
- both notifications.json writers in codec_ask_user (question-write +
  mark-read) now also hold file_lock(NOTIFICATIONS_PATH), matching the
  pairing the pending_questions path already used.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ssrf): guard web_fetch + crew fetch against internal/metadata URLs (H1)

Attacker-influenceable URLs (chat injection, clipboard, crew task) could be
fetched against 127.0.0.1, 10/8, 192.168/16, 169.254.169.254 (cloud
metadata), etc. — and the body flows back into the LLM transcript, so a read
is an exfil path even with egress otherwise denied.

Add codec_ssrf.validate_url: enforces http(s), resolves the host, and rejects
if ANY resolved address is loopback / private / link-local / multicast /
reserved / unspecified (dual-record + IPv4-mapped-IPv6 aware). Wire it BEFORE
the HTTP call in skills/web_fetch.py (which clipboard_url_fetch delegates to)
and in codec_agents._web_fetch. DNS-rebinding TOCTOU limitation documented.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(skills): refuse approving a skill that shadows a pinned built-in (H2/H6)

The review-and-approve flow wrote an approved skill to the skills dir by
basename with no name-collision check. An approved skill named after a
hash-pinned built-in (calculator.py, file_write.py, ...) would take that
trusted name — shadowing the real built-in, or overwriting it if the write
dir is the repo skills dir.

skill_approve now refuses any filename matching a name in the committed
skills/.manifest.json (case-insensitive, so the guard also holds on
case-insensitive filesystems), emitting skill_approve_blocked. PR-1A's
load-time hash/AST gate remains the last line of defense; this refuses at the
write point so the trusted file is never displaced.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test: A-12 inline-LLM-POST guard + OAuth scope-escalation coverage (Fix #10)

- tests/test_a12_invariant.py: fails if a new inline `.post(...chat/completions
  ...)` appears outside codec_llm. Allowlists the documented vision sites
  (dashboard, watcher, both bridges, screenshot_text) + codec_core's generated
  session-script string. Runs in CI via the full pytest suite (no separate
  workflow step needed). Includes a synthetic-violation test proving the guard
  is not a no-op.
- tests/test_oauth_provider.py: regression tests for the refresh-token
  scope-escalation defense (requested scopes must be a subset of the original
  grant) — previously untested.

permission_gate mutation cases are already covered by test_agent_runner.py
(8 tests incl. D-5 traversal/symlink); not duplicated. The flaky time.sleep ->
threading.Event refactor is deferred — no specific flaky test identified, and
rewriting passing tests without evidence risks regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: design notes for Fix #8 (routes/chat extraction) + Fix #9 (jsonstore registry)

Design-only, per CLAUDE.md §11 + the SECURITY-REMEDIATION-DESIGN gate for the
two remaining structural fixes. No production code. Each note grounds scope,
seam, migration phasing, test plan, rollback, and open decisions in the actual
HEAD surface (codec_dashboard chat cluster ~lines 2207-3005; ~30 ad-hoc
json.dump writers + 2 duplicate atomic helpers). Awaiting sign-off before
implementation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(state): codec_jsonstore Phase 0 hardening + notifications RMW lock + registry (C8)

Fix #9, scoped to the high-value, low-risk core (recommended decisions applied):

- Phase 0: atomic_write_json gains `default=` (custom serializer, e.g. str for
  datetime) and `sort_keys=` passthrough; new read_modify_write(path, fn)
  standardizes lock+load+mutate+atomic-write so a future RMW can't forget the
  file_lock.
- Phase 2: routes/_shared._save_notification now holds the cross-process
  file_lock across its load->insert->write (it ran under the in-process
  _notif_lock only, so separate PM2 daemons clobbered each other).
- Phase 5: docs/STATE-FILES.md registry of every ~/.codec state file + lock
  policy + migration status.

Deferred to a documented follow-on (see STATE-FILES.md): Phase 1 mass
overwrite-durability swaps; Phase 3 helper convergence (entangles the
don't-touch pending_questions/grants write paths — byte-preserving, reviewed
per-helper); Phase 4 don't-touch files (config.json, OAuth/Google tokens) with
per-file sign-off. The broad json.dump CI ratchet was rejected as low-signal
(rationale in STATE-FILES.md).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(dashboard): extract chat_completion phases into helpers (C9 start)

Fix #8, intra-file CC reduction. The routes/chat.py module split is deferred:
chat_completion shares _load_prompt_overrides + CHAT_SYSTEM_PROMPT with the
prompt-override endpoints, so a clean split needs those relocated first
(tracked in docs/FIX8-ROUTES-CHAT-EXTRACTION-DESIGN.md).

Extract two of the heaviest blocks out of the 356-line chat_completion into
named, independently-testable helpers — behavior-preserving (the 48-test
chat/stream/budget/escalation baseline is byte-identical before/after):
- _build_chat_system_prompt(config, budget, has_attachment, last_user_text):
  override + step-budget + attachment / content-rewrite / observer-inject logic.
- _chat_vision_response(body, messages): the image -> vision-model branch.

chat_completion: 356 -> 252 lines. Adds tests/test_chat_helpers.py (5 unit
tests now possible because the logic is no longer trapped inside the handler).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ws): authenticate /ws/voice handshake (re-audit N1, Critical)

AuthMiddleware subclasses Starlette BaseHTTPMiddleware, which only runs on the
`http` scope — so the /ws/voice WebSocket handshake was completely
unauthenticated. Combined with the voice pipeline running any trigger-matched
skill (terminal/file_ops/pilot, no allowlist), an exposed dashboard (Cloudflare
tunnel / dashboard_host=0.0.0.0) was an open, auth-free path to code execution.

Add _ws_authorized() mirroring the HTTP AuthMiddleware Layers 0/1/2 (open when
nothing configured; else dashboard_token via ?token=/Bearer, OR a valid
TOTP-verified biometric session cookie) and reject before accept(). The
middleware author already intended /ws to be gated (the 401 branch checks
path.startswith("/ws")) — it just never executed on the websocket scope.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ssrf): re-validate every redirect hop in web_fetch + crew fetch (re-audit N3)

Fix #7 validated only the INITIAL URL, but both fetch sites follow redirects by
default (requests allow_redirects=True; httpx follow_redirects=True), so a
public URL returning 302 -> http://169.254.169.254/ (cloud metadata) or
http://127.0.0.1/ reached the internal target unchecked and returned its body
into the LLM transcript — defeating the H1 guard.

Both sites now follow redirects MANUALLY with allow_redirects/follow_redirects
False, calling codec_ssrf.validate_url on every hop and capping at 5 redirects.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(skills): resolve symlinked final component in file_write/file_ops (re-audit N4/N10)

Both _is_safe_target (file_write) and _is_safe_path (file_ops) realpath-resolved
only the PARENT dir then re-joined the raw basename, so a symlinked FINAL
component (e.g. ~/Documents/x.md -> ~/.zshrc, or x.json -> ~/.codec/
oauth_state.json) passed the blocklist and open() followed it — a write-through
to ~/.zshrc/~/.codec (RCE/secret overwrite) or a read-exfil of secrets over the
MCP-exposed surface, defeating D-4.

Realpath the FULL path in both safety checks (resolves a symlinked basename to
its blocked target), and open()/read()/write() the resolved target so the op
can't be redirected between check and open (TOCTOU). realpath handles
not-yet-existing files. Matches codec_agents._file_write, which already did it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(auth): enforce TOTP on the ?s= session fallback (re-audit N5)

AuthMiddleware's ?s=<token> GET fallback (for mobile img/stream URLs) checked
only token existence + age — NOT totp_verified, unlike the cookie path. Since
auth_verify/auth_pin hand out a usable session token BEFORE TOTP is completed,
a pre-TOTP (or stolen pre-TOTP) token could reach any GET /api endpoint via
?s=, bypassing 2FA.

Extract _session_token_valid() (existence + age + TOTP) and route BOTH the
cookie path (_verify_biometric_session) and the ?s= fallback through it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: vision error-handling + observer sys.path leak + _research_jobs lock/TTL (re-audit N7/N8/N9)

N7: _chat_vision_response runs OUTSIDE chat_completion's try/except, so a
non-200/malformed vision-backend response surfaced as a raw 500. Guard the
POST + parse and return a graceful 502.

N8: codec_observer._ocr_call did sys.path.insert() twice per poll (up to ~5760
entries/day) — and the very next comment says no import happens. Dead + a slow
unbounded leak; removed.

N9: _research_jobs had no lock and no eviction (unlike _agent_jobs post-H-4) —
unbounded memory growth + a dict-changed-size race vs the worker. Add
_evict_stale_research_jobs (+ _research_jobs_lock) mirroring the agent-jobs
pattern; deep_research_start evicts + guards the add/update under the lock.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: builtins AST gap + fail-closed skill-approve + timing-safe PIN (re-audit N19/N20/N21)

N19: add `builtins` to _SKILL_DANGEROUS_MODULES — `import builtins;
builtins.exec(src)` bypassed the gate (builtins.exec is an Attribute call, not
the bare-name Call the dangerous-calls check catches).

N20: _pinned_builtin_names() returned an empty set on a manifest read failure,
so skill_approve's pinned-name guard blocked nothing during a transient
failure. Return None on failure and FAIL CLOSED (refuse the approve).

N21: routes/auth.py compared the PIN hash with `==` (timing side-channel) —
use hmac.compare_digest.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* build: pin CVE-safe dependency floors + fastmcp import-compat guard (Fix #6)

Live-env investigation (read-only + staged, rolled back) found: the deployed
python3.13 runs fastmcp 3.1.1 while requirements declares >=3.3.1, and an
in-place upgrade to 3.3.1 left a broken install state on that interpreter. In a
CLEAN env the code imports fine on 3.3.1 — so the fix is a clean reinstall, NOT
a code port. The fastmcp SSRF (CVE-2026-32871) is in OpenAPIProvider, which
CODEC does not use, so it isn't exploitable here (hygiene, not an emergency).

- requirements.txt: add advisory-safe floors for the transitive/runtime deps
  that were unpinned — urllib3>=2.7.0, pillow>=12.2.0, cryptography>=46.0.7.
  (fastmcp>=3.3.1 and requests>=2.34.2 were already correct; the deployed
  interpreters were just stale vs the declared floors.)
- tests/test_fastmcp_compat.py: assert fastmcp meets the 3.3.1 floor and that
  codec_mcp + codec_oauth_provider import against it — runs in CI and would
  have caught both the version drift and any future fastmcp API move.

NO live mutation: the deployed venvs are untouched (rolled back to their
original state). Refresh them at a maintenance window — see PR runbook.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: skill_forge off chat allowlist + atomic config writes (re-audit mediums)

CHAIN-002: skill_forge writes forged code to disk WITHOUT the review gate, yet
was on CHAT_SKILL_ALLOWLIST — a prompt-injected [SKILL:skill_forge:...] tag
could persist an unsandboxed skill. Removed it (+ the stale ask_codec_to_build
entry that had no backing skill); create_skill stays as the review-gated path.

Atomic config writes (truncated-read → empty-fallback race): route
save_triggers (custom_triggers.json), custom-agent save, and update_schedule
(schedules.json) through codec_jsonstore.atomic_write_json instead of
truncate-then-write, so a concurrent reader can't catch a half-written file.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: design note for the chat/MCP strict-consent gate (re-audit medium)

The Step-3 strict-consent gate lives only in codec_agent_runner; the chat
[SKILL:] + MCP tool paths reach high-power skills with only is_dangerous /
path-blocklists (red-team CHAIN-001/002/006). This >1-module, UX-changing fix
is design-gated per CLAUDE.md §11 — note covers the two chokepoints
(codec_dispatch.run_skill, codec_mcp.tool_fn), the per-transport consent UX
(MCP hard-refuse / chat prompt / voice announce), destructive classification,
test/rollback, and the open decisions. No code until approved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(consent): MCP hard-refuse for destructive skills + classifier (B1/C) — and regen skill manifest

Consent gate, increment 1 (Decisions B1 + C):
- codec_consent: is_destructive_skill (per-skill SKILL_DESTRUCTIVE flag via the
  registry + _HTTP_BLOCKED backstop + known high-power built-ins) + gate_enabled
  kill switch (CONSENT_GATE_ENABLED).
- codec_skill_registry: AST-extract SKILL_DESTRUCTIVE + get_destructive() accessor
  (parallel to SKILL_MCP_EXPOSE) — the extensible per-skill path.
- codec_mcp.tool_fn: hard-refuse destructive skills over MCP (claude.ai can't
  consent at the operator tier) — closes the red-team's file_ops/imessage_send/
  pilot-over-MCP chains; emits a `denied` audit.

Also REGENERATES skills/.manifest.json: the earlier Fix #7 (web_fetch) and
N4/N10 (file_write, file_ops) edits changed those built-ins' bytes, so their
hash-pins were stale — the load-time gate would have AST-refused file_write/
file_ops (they import os) in production, and CI --check would fail. Manifest now
matches the 76 built-ins.

Chat-side consent (A1: consent_required + re-dispatch) is the next increment.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(consent): chat-path consent gate via the AskUserQuestion panel (A2)

Consent gate increment 2 (Decision A2 — reuse the proven Phase 1 Step 3 PWA
consent panel, no new frontend). Both chat skill-runners now gate destructive
skills before firing:
- _try_skill (pre-LLM hijack) and _try_skill_by_name (post-LLM [SKILL:] tag —
  the prompt-injection vector) call codec_consent.chat_consent_ok, which for a
  destructive skill routes through codec_ask_user.ask(destructive=True,
  asked_from="chat") — literal verb-match consent rendered by the existing PWA
  panel; declined/timeout/ask_user-unavailable → skill skipped (fail-closed).
- Non-destructive skills are unaffected (no prompt). Kill switch:
  CONSENT_GATE_ENABLED=false.

With B1 (MCP hard-refuse) this closes the red-team's cross-cutting root cause:
the chat [SKILL:] + MCP paths reached high-power skills with only is_dangerous.
is_dangerous is retained as the layered terminal command heuristic (Decision D).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Mickael Farina <farina.mickael@gmail.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants